CLIR Experiments at Maryland for TREC 2002: Evidence Combination for Arabic-English Retrieval

Authors

  • Kareem Darwish
  • Douglas W. Oard
Abstract

The experiments reported in this paper focused on techniques for combining evidence in cross-language retrieval, in which Arabic documents are searched using English queries. Evidence from multiple sources of translation knowledge was combined to estimate translation probabilities, and four techniques for estimating query-language term weights from document-language evidence were tried. A new technique that exploits translation probability information was found to outperform a comparable technique in which that information was not used. Comparative results for three variants of Arabic “light” stemming are also presented. A simple variant of an existing stemming algorithm was found to result in significantly better retrieval effectiveness.


Similar resources

TREC-10 Experiments at University of Maryland CLIR and Video

University of Maryland researchers participated in both the Arabic-English Cross-Language Information Retrieval (CLIR) and Video tracks of TREC-10. In the CLIR track, our goal was to explore effective monolingual Arabic IR techniques and effective query translation from English to Arabic for cross-language IR. For the monolingual part, the use of the different index terms including words, s...


Building an Arabic Stemmer for Information Retrieval

In TREC 2002, the Berkeley group participated only in the English-Arabic cross-language retrieval (CLIR) track. One Arabic monolingual run and three English-Arabic cross-language runs were submitted. Our approach to cross-language retrieval was to translate the English topics into Arabic using online English-Arabic machine translation systems. The four official runs are named as BKYMON, BKYCL...


Towards a New Standard Arabic Test Collection for Mono- and Cross-Language Information Retrieval

In this paper we propose a new standard Arabic test collection for mono- and cross-language Information Retrieval (CLIR). To do this, we exploit the “Hadith” texts and provide a portal for sampling and evaluating Hadith results listed in both Arabic and English versions. The new standard Arabic test collection, called “Kunuz”, will promote and restart the development of Arabic mono retrieva...


TREC-8 Experiments at Maryland: CLIR, QA and Routing

The University of Maryland team participated in four aspects of TREC-8: the ad hoc retrieval task, the main task in the cross-language retrieval (CLIR) track, the question answering track, and the routing task in the filtering track. The CLIR method was based on Pirkola’s method for Dictionary-based Query Translation, using freely available dictionaries. Broad-coverage parsing and rule-based ma...


TREC Experiments at Maryland: CLIR, QA and Routing

The University of Maryland team participated in four aspects of TREC: the ad hoc retrieval task, the main task in the cross-language retrieval (CLIR) track, the question answering track, and the routing task in the filtering track. The CLIR method was based on Pirkola’s method for Dictionary-based Query Translation, using freely available dictionaries. Broad-coverage parsing and rule-based matching was us...




Publication date: 2002